Mel Frequency Cepstral Coefficients: An Evaluation of Robustness of MP3 Encoded Music

Authors

  • Sigurður Sigurðsson
  • Kaare Brandt Petersen
  • Tue Lehn-Schiøler
Abstract

In large MP3 databases, files are typically generated with different parameter settings, i.e., different bit rates and sampling rates. This is of concern for MIR applications, as encoding differences can potentially confound meta-data estimation and similarity evaluation. In this paper we discuss the influence of MP3 coding on the Mel frequency cepstral coefficients (MFCCs). The main result is that the widely used subset of the MFCCs is robust at bit rates equal to or higher than 128 kbits/s, for the implementations we have investigated. However, for lower bit rates, e.g., 64 kbits/s, the implementation of the Mel filter bank becomes an issue.
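For orientation, the following is a minimal sketch of a generic MFCC computation for a single frame, showing where the Mel filter bank sits in the pipeline. It is not the implementation evaluated in the paper: the Hz-to-mel formula, the triangular filter shape, the filter count, and the names `mel_filter_bank`, `mfcc`, and `f_max` are all assumptions chosen for illustration, and real implementations differ in exactly these details.

```python
import cmath
import math


def hz_to_mel(f_hz):
    """One common Hz-to-mel mapping; other implementations use other formulas."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-windowed power spectrum via a naive DFT (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def mel_filter_bank(n_filters, n_fft, sample_rate, f_max=None):
    """Triangular filters with edges equally spaced on the mel scale.

    f_max (hypothetical parameter) sets the upper edge of the last filter;
    it defaults to the Nyquist frequency.
    """
    f_max = sample_rate / 2.0 if f_max is None else f_max
    top = hz_to_mel(f_max)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            f = k * sample_rate / n_fft  # centre frequency of DFT bin k
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13, f_max=None):
    """Power spectrum -> mel band energies -> log -> DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filter_bank(n_filters, len(frame), sample_rate, f_max)
    # A small floor avoids log(0) when a band holds no energy.
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
             for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)]


# Usage: one 256-sample frame of a 440 Hz tone sampled at 16 kHz.
frame = [math.sin(2.0 * math.pi * 440.0 * i / 16000) for i in range(256)]
coeffs = mfcc(frame, 16000)
```

The naive DFT keeps the sketch dependency-free; a practical implementation would use an FFT routine. Choices such as the number of filters and the upper band edge `f_max` are examples of the filter bank details that can vary between implementations.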

Similar resources

Shape-based Spectral Contrast Descriptor

Mel-frequency cepstral coefficients are used as an abstract representation of the spectral envelope of a given signal. Although they have been shown to be a powerful descriptor for speech and music signals, more accurate and easily interpretable options can be devised. In this study, we present and evaluate the shape-based spectral contrast descriptor, which is built up from the previously prop...

Noise-Robust Speech Features Based on Cepstral Time Coefficients

In this paper, we investigate the noise-robustness of features based on the cepstral time coefficients (CTC). By cepstral time coefficients, we mean the coefficients obtained from applying the discrete cosine transform to the commonly used mel-frequency cepstral coefficients (MFCC). Furthermore, we apply temporal filters used for computing delta and acceleration dynamic features to the CTC, res...

Singing/humming System through Query Proportion

Query by Singing/Humming (QBSH) is a Music Information Retrieval (MIR) system with a small audio excerpt as the query. The rising availability of digital music calls for effective music retrieval methods. Further, MIR systems support content-based searching for music and require no musical acquaintance. Current work on QBSH focuses mainly on melody features such as pitch, rhythm, note etc., size of...

An Extensive Analysis of Query by Singing/Humming System Through Query Proportion

Query by Singing/Humming (QBSH) is a Music Information Retrieval (MIR) system with a small audio excerpt as the query. The rising availability of digital music calls for effective music retrieval methods. Further, MIR systems support content-based searching for music and require no musical acquaintance. Current work on QBSH focuses mainly on melody features such as pitch, rhythm, note etc., size of...

Improving the noise-robustness of mel-frequency cepstral coefficients for speech processing

In this paper we study the noise-robustness of mel-frequency cepstral coefficients (MFCCs) and explore ways to improve their performance in noisy conditions. Improvements based on a more accurate model of the early auditory system are suggested to make the MFCC features more robust to noise while preserving their class discrimination ability. Speech versus non-speech classification and speech r...

Publication date: 2006